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Abstract 

Structural fluctuations in the thermal equilibrium of the kinesin motor 
domain are studied using a lattice protein model with Go interactions. By 
means of the multi-self-overlap ensemble (MSOE) Monte Carlo method 
and the principal component analysis (PCA), the free-energy landscape 
is obtained. It is shown that kinesins have two subdomains that exhibit 
partial folding/unfolding at functionally important regions: one is located 
around the nucleotide binding site and the other includes the main mi- 
crotubule binding site. These subdomains are consistent with structural 
variability that was reported recently based on experimentally-obtained 
structures. On the other hand, such large structural fluctuations have 
not been captured by B-factor or normal mode analyses. Thus, they are 
beyond the elastic regime, and it is essential to take into account chain 
connectivity for studying the function of kinesins. 

INTRODUCTION 

Biomolecular motors are proteins that exhibit biolocomotion by converting 
chemical energy into mechanical energy [IJ EJ E]. A detailed mechanism of 
this energy conversion has been a long-standing problem. Unlike macroscopic 
motors, biomolecular motors operate under a strong influence of thermal fluc- 
tuations, since the chemical energy provided by ATP hydrolysis is of the same 
order as that of the thermal fluctuations. 

Kinesin, which is responsible for intracellular transport and cell division [4], 
moves stepwise in one direction along the microtubule with one ATP hydrolysis 
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per step [SJ O [7] ■ Kinesin has attracted considerable attention and was studied 
intensively in recent years. Conventional kinesin functions as a dimer; it is 
widely accepted that kinesin walks hand-over-hand on a microtubule [8|. The 
gliding movement of kinesin is believed to be realized by controlling the binding 
with and dissociation from the microtubule [DJ. Kinesin binds tightly to the 
microtubule in the nucleotide-free and ATP states, whereas it detaches from 
the microtubule in the ADP state [ID] . 

The structures of kinesin and myosin share highly conserved regions known 
as switch I and switch II, named after their counterparts in G proteins [11] : G 
proteins are signaling proteins that bind to GTP instead of ATP. The switch 
regions of the G proteins move during nucleotide hydrolysis and thereby control 
the binding with and detaching from a target protein [12]. As for kinesin, 
both the switch I and switch II regions have different conformations among 
different X-ray structures [T31II31 US] ; further, they show microtubulc-dcpcndent 
structural changes [TBJ. This strongly suggests that kinesin also controls its 
affinity with the rail protein by structural transformations of the switch regions. 

It is commonly considered that functionally important fluctuations of a pro- 
tein can be read from the B-factor of the X-ray crystal structure. For kinesin, 
however, it was reported that a large B-factor does not correspond to a (suppos- 
edly) functionally important fluctuation [13] . In particular, it was pointed out 
that the fluctuations of the switch regions are largely underevaluated. More- 
over, a normal mode analysis based on the elastic-network model has revealed 
that small fluctuations in the elastic regime are insufficient to capture the con- 
formational change of kinesin in contrast to the other molecular motors [17] , 
These results indicate that larger-scale structural fluctuations beyond the elas- 
tic regime are relevant to the function of kinesins. In order to investigate such 
large structural fluctuations, we follow a methodology of folding study in this 
work. 

The energy landscape theory of protein folding has been accepted widely 
in the last decade; this theory states that proteins have a funnel-like energy 
landscape toward the native structure [HI [19] . This new view of the protein 
folding has resulted in a revival of the Go model [5DJ [ST] , which was originally 
introduced for explaining the mechanism of two-state folding transitions of small 
globular proteins. Recent studies of protein folding have shown that Go- like 
models (namely, models constructed with the same design as the Go model) 
are applicable far beyond the original scope; they can describe the folding of 
some proteins with intermediate states [22l E31 Ell E3 EH EH EH [29]. In this 
work, we examine the large structural fluctuations of kinesins by applying a 
Go-like model for calculating a free-energy landscape. We use a realistic lattice 
protein model [281 122] , because their relatively small conformation spaces are 
fairly favorable for numerical simulations. 
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MODEL AND METHODS 



In the lattice model of its simplest form, each amino acid residue is expressed 
as a bead that is allowed to move on a simple cubic lattice, and the polypeptide 
chain is represented as a string of connected beads with a unit bond length 
[3Ql [3T] . If a pair of unconnected beads occupies neighboring sites, the pair is 
called to form a contact and the interaction is applied to it. Although such a 
lattice model can reproduce various universal properties of the protein folding, 
it is too simple to represent the chain geometry of real proteins; thus, we need 
to introduce a lattice model of a higher resolution for the present purpose. 

Some realistic lattice models that can represent flexible protein structures 
have been proposed [32] [33J [34]. In the present work, we use a 210-211 hybrid 
lattice model in which C a atoms of amino acids are again located on a simple 
cubic lattice; however, consecutive atoms are connected by vectors of the type 
(2,1,0) or (2,1,1) and all their possible permutations [28j [29]. The length of 
(1,0,0) vector is set to 1.62 A, and the length of a C"— C Q bond is then 3.63 
A or 3.97 A. In order to express the excluded volume effect, we consider that 
each amino acid occupies seven lattice sites, that is, a center site at the bead 
position and its nearest neighbors. 

We introduce Go interactions between amino acids, which act only on pairs 
of amino acids that form native contacts. We call such pair a "native pair" 
from now on. The X-ray crystal structure of the kinesin motor domain of 
Homo sapiens (PDB ID code lbg2) is used as a reference of the native structure 
(Figure 1(a)) [35]. This crystal structure is of the ADP-binding state (note that 
the ADP molecule is not included explicitly in the following simulations). The 
native structure of the lattice protein model is defined by fitting to the C" trace 
of this crystal structure. The accuracy of the fitting in terms of the root mean 
square deviation (rmsd) is 0.92 A(Figure 1(b)). The Go potential between the 
i th and j th amino acids is defined as follows: 



V g3 = - e*Ci J *{r i j,r% t ), (1) 

j-i>2 

» / , f 1 \x 2 -y 2 \ < W 

A ^ = { 0, otherwise < 2 > 

where rij is the distance between C" atoms, is their native distance, £jj is 
the interaction strength between residue i and j, and C\j is the constant that 
equals 1 or depending on whether the pair is a native pair or not. W is the 
width of the Go potential. 

We add a local potential to maintain a protein-like local structure. Here, we 
introduce a harmonic potential between the i th and (i + 2) th residues to bias 
the bond angles toward the native structure. The local potential is 



V local = ^£(r M+2 -r^ 2 ) 2 (3) 
»=l 
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where K(, is the strength of the local interaction. We use W = 2 and Kb = 
1. The interaction strength e% t j takes either the uniform value e% j = 1 or 
normalized Miyazawa-Jernigan (MJ) contact-energy parameters [38j . 

We compute the density of states and several physical quantities in thermal 
equilibrium by using a Monte Carlo scheme called multi-self-overlap ensemble 
(MSOE) Monte Carlo method [Ml SO]- In this method, the self-avoiding condi- 
tion is systematically weakened, and a flat histogram both in energy and degree 
of the excluded volume is obtained by a bivariate extension of the multicanoni- 
cal ensemble Monte Carlo method. By calculating the physical quantities only 
for the self-avoiding conformations, correct canonical averages are obtained. 
The MSOE method gives a high performance, especially for long proteins, in 
comparison with the standard multicanonical ensemble for this realistic lattice 
model. It should be noted that both lattice model and multicanonical methods 
are not suitable for studying dynamics of protein. In the present work we focus 
on equilibrium properties, the free-energy landscape in particular, which can 
be treated by the lattice model as long as the conformations are realistically 
expressed. 

RESULTS 

Simulations are prepared for the following three models: (i) C"(7.5): A pair 
of amino acids is considered to be a native pair if the distance between their 
C a atoms is less than 7.5 A in the X-ray crystal structure, (ii) All(4.5): A 
pair of amino acids is considered to be a native pair if the minimum distance 
between their heavy atoms is less than 4.5 A in the X-ray crystal structure, (hi) 
A11(4.5)/MJ: Definition of the native pair is same as All(4.5), but heterogeneous 
interactions based on normalized MJ contact-energy parameters [35] is used as 
the Go interaction. The density of state of kinesin across a wide energy range 
including both the native state and unfolded state are calculated by MSOE. Fig- 
ure 2(a) shows the entropy of the kinesin motor domain. The figure represents 
the folding funnel if rotated by 90 degrees. 

Free-Energy Landscape and Fluctuation of C Q (7.5) Model 

We first focus on the C a (7.5) model. The energy shown in Figure 2(b) (solid 
line) jumps at two temperatures. Both the jumps correspond to cooperative 
transitions. The gyration radius significantly changes at the higher transition 
point, and slightly at the lower transition point (Figure 2(c)). As shown below, 
the hydrophobic core is broken and kinesin unfolds completely above the higher 
transition point. Thus, the higher transition is the main folding/unfolding tran- 
sition and is considered to be irrelevant to the stepping motion. On the other 
hand, folding/unfolding of local structures takes place at the lower transition 
point with the hydrophobic core maintained folded. 

In order to examine each transition in detail, we carry out the principal com- 
ponent analysis (PCA) for contact formation of the native pairs. The variable 
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Xk is introduced, where the index k runs all the native pairs; Xk = 1 if fc-th 
native pair is in contact and otherwise. The variance-covariance matrix 

Mm = (x k xi) - (x k )(xi) (4) 

is calculated, where () indicates thermal average. Then Mki is diagonalized to 
yield the eigenvalues and eigenvectors. The eigenvectors are called the principal 
components (PC); PC expresses which contacts are formed/unformed coopera- 
tively, and PC with a large eigenvalue represents an important structural fluc- 
tuation. We calculate the average value of amplitudes in PC of all the contacts 
belonging to each residue. Residues are regarded to constitute a fluctuation 
unit, if they have relatively large averages in the same PC. 

At the higher transition point, only one eigenvalue is large and the fluctua- 
tion unit is located at the hydrophobic core [central anti-parallel /3-sheet (Fig- 
ure 1(a))]. The eigenvalue distribution at the lower transition point is shown in 
Figure 3(a). There are two large eigenvalues. The fluctuation unit of the first 
eigenvector (PCI) is composed of helix 4, loop 12, helix 5, helix 6, and strands 
5a/b (insertion of strand 5), and that of the second eigenvector (PC2) is com- 
posed of helix 3 and helix 2 [Figure 3(b)] We found that structural fluctuations 
of these fluctuation units actually are partial folding and unfolding, because the 
components in the principal eigenvectors have the same sign. The former region 
overlaps with the switch II region and the latter with the switch I region. Hence, 
we call these regions SWII subdomain and SWI subdomain, respectively. 

Figure 3(c) shows the free-energy landscape in two-dimensional space spanned 
by the first two principal components PCI and PC2 at the transition point. Four 
states are distinguishable as four minima in the free-energy landscape. They 
are classified according to the formation of SWI and SWII: both SWI and SWII 
are unfolded (UU state); only SWI is formed (FU state); only SWII is formed 
(UF state); both the subdomains are formed (FF state). The two subdomains 
behave as mutually independent folding units because they belong to different 
eigenvectors. In other words, partial folding/unfolding transitions of FF «-> FU 
<-> UU and FF <-> UF <-» UU occur independently. It should be noted that the 
FU <-> UF transition is not realized because of the free-energy barrier. 

If these fluctuation units just reflect intrinsic properties of the lower transi- 
tion point and do not have any relevance for other temperature, they may not 
be of interest from the viewpoint of function of kinesin. In order to demonstrate 
that this is not the case and the fluctuation units are rather robust, we show the 
energy dependence of the mean contact probability of the native contacts for 
each residue in Figure 3(d). Five states are distinguishable, each of which cor- 
responds to a different energy range. The highest energy range corresponds to 
the completely unfolded state, so that it is not relevant to the lower transition. 
The /3-sheet hydrophobic core is formed in the other four states. They actually 
correspond to UU, FU, UF and FF states described above and appear in Figure 
3(d) in this order from a higher energy to a lower energy. Thus, we can safely say 
that SWI and SWII subdomains behave as two independent fluctuation units 
as long as the hydrophobic core is formed. 



5 



Units of Fluctuation Are Robust 

Let us now turn to the All(4.5) and A11(4.5)/MJ models. As is seen in Figure 
2(b), only a single jump is observed in the energy. The protein is completely 
unfolded at temperatures above this jump. Hence, in these cases, the fold- 
ing/unfolding of local structures are not separated from that of the hydrophobic 
core with respect to the temperature. Thus, PCA is not suitable to describe 
partial folding/unfolding. Nevertheless, we can discuss the folding unit from the 
energy dependence of the mean contact probability of the native contacts. As 
is seen in Figures 4(a) and 4(b), five states - unfold, UU, UF, FU, and FF - are 
again distinguishable. In other words, they are separated in energy similar to 
the C a (7.5) models. Thus, two subdomains - SWI and SWII - are also included 
as the folding units in both the All(4.5) and A11(4.5)/MJ models. The structure 
of the boundary of the subdomains is slightly different; strand 1 is included in 
the hydrophobic core in the C a (7.5) model, but in the SWII subdomain in the 
All(4.5 ) and A11(4.5)/MJ models. Strand 5a/b is included in the SWII subdo- 
main in the C Q (7.5) model, but not in the All(4.5) and A11(4.5)/MJ models. In 
brief, although the number of transitions is different for different models, the 
overall structure of folding units (subdomains) is robust. 

DISCUSSION 

The simulation result indicate that the kinesin motor domain has two subdo- 
mains other than the hydrophobic core: SWI and SWII, which overlap with 
the switch I and switch II regions, respectively. These two subdomains ex- 
hibit structural fluctuations, which actually are partial folding/unfolding, and 
nearly independent when the nucleotide is absent. The largest fluctuation is 
localized at the SWII subdomain. A question may occur whether these subdo- 
mains are relevant to the real kinesin protein or not. Quite recently, Grant et 
al. studied structural variability of kinesin using 37 available structures in the 
protein data bank (PDB) [41]. The PCA revealed that there are two significant 
principal components. The first principal component describes the concerted 
displacement located at helix 4, loop 12, helix 5, and loop 13. This subdo- 
main corresponds to the central part of the SWII subdomain we found. They 
suggested that the second principal component is localized at helix 3 and the 
vicinity of loop 6 and loop 10. Seeing their result, we found that the component 
has a large value at helix 2, which actually is located in the vicinity of loop 6. 
Although they did not call this region as subdomain explicitly, the region and 
the SWI subdomain overlap in large part with each other. Thus, our simulation 
result is consistent to their analysis. This indicates that the SWI and SWII 
subdomains successfully captured the functionally important fluctuations. 

These two subdomains are considered to play important roles in the step- 
ping motion, based on the following experimental facts: (i) Helix 4, loop 12, 
and helix 5 belonging to the SWII subdomain are known to undergo a large 
conformational change during the ATP hydrolysis cycle [15]; this region also 
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includes the main microtubule binding site [131 [4j5] . Moreover, the SWII subdo- 
main interacts directly with the neck linker, which connects the motor domain 
and neck region (the neck linker is not included in our simulation), because the 
neck linker is adjacent in sequence to helix 6 (included in the SWII subdomain). 
(ii) Helix 3 in the SWI subdomain also undergoes a large conformational change 
during the ATP hydrolysis cycle [44 . In addition, judging from the number of 
included residues, we may safely assume that a time scale of the conformational 
changes of these two subdomains of real kincsin is approximately the same or- 
der as that of the stepping motion (~ msec). In contrast, it is unlikely that 
the hydrophobic core unfolds on the stepping motion, because the time scale of 
the core unfolding should be much longer (~ sec or longer). The present results 
suggest that conformational changes of these functionally important regions are 
realized by a partial unfolding-refolding. We showed that SWI and SWII be- 
haves as independent folding unit in the absence of nucleotide. Binding to the 
nucleotide may induce coupling between these two folding units, which will be 
left for future study. 

The subdomain structure is largely determined by the topology of the native 
structure because three variations of coarse-grained Go-like models show similar 
subdomain structures. It has been pointed out that the cooperativity of folding 
transition depends on the detail of the interaction within the framework of the 
Go-like model, such as inclusion of many-body interaction [45] , interaction range 
|28j . and so on. However, as we found in previous study subdomain structure is 
rather insensitive to such detail [28] . 

We note that the large fluctuations of these subdomains arc totally different 
from the small fluctuations observed by, for example, the B-factor in the X- 
ray crystal structure and elastic network model. In fact, the B-factor of the 
subdomains is not particularly larger than that of other parts of the motor 
domain. Thus, the large fluctuations described by the present model are beyond 
the elastic regime and could not be captured by the B-factor or clastic network 
model. It is essential to take into account the chain connectivity explicitly for 
studying the function of kinesins. Myosin motor domain is more than twice the 
size of kinesins, but their nucleotide binding regions and switch I/II regions have 
similar topology to those of kinesins. This indicates that the basic properties of 
the free-energy landscape of myosin may also be similar. 

Before closing the discussion, some comments should be made on related 
simulations. All-atom molecular dynamics simulations of the kinesin motor 
domain have been performed [46] [47] . Wriggers et al. captured the nucleotide- 
dependent small fluctuations of the switch regions [35], and Minchardt et al. 
simulated conformational change of the switch I region [47]. Although, these 
results are in part consistent with the present result, these simulations are ex- 
tremely short in time (order of nanosecond) as compared to the time scale of 
the stepping motion (order of millisecond), since the all atom simulation of 
a large protein as kinesin is still a hard task. Recently, Hyeon and Onuchic 
performed the simulation of the kinesin dimer bounded on the microtubule by 
the Go-like model, and investigated the allosteric transition regulated by the 
inter-monomer strain through the neck-linker [48] , In contrast, we focused on 
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the structural fluctuations of a kinesin monomer in thermal equilibrium. We 
found that kinesin has two subdomains, which give rise to the conformational 
fluctuation of the functionally important regions. This result is supported by 
the recent analysis on experimental data [41] . The coarse-grained Go-like model 
made it possible to describe the large structural fluctuation that is relevant to 
the function. 
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(a) 
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Figure 1: (a) The X-ray structure of kinesin motor domain (PDB ID code lbg2). 
It consists of a central antiparallel /3-sheet of eight strands, sandwiched between 
three a-helices on either side. Orange: ADP, blue: helix 2 (91-122), strand 
5 (141-144, 171-173), helix 3 (176-189) and switch I (198-203), red: switch II 
(231-236), loop 11 (238-254), helix 4 $57-269), loop 12 (272-280), and helix 
5 (281-292), green: strand 1 (9-15), and helix 6 (306-320). This figure was 
prepared using the programs MOLSCRIPT [36] and Raster3D [37]. (b) The 
superposition of native structure of the lattice protein model of kinesin (blue) 
and the C" trace of the X-ray crystal structure (red). Accuracy of the fitting 
in terms of the root mean square deviation (rmsd) is 0.92 A. 




Figure 2: (a) Relative entropy S(E) of kinesin as a function of energy E. (b) 
The average energy E as a function of temperature T. (c) The radius of gyration 
R as a function of temperature T. 
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Figure 3: Free-energy landscape and structural fluctuation of C Q (7.5) model, 
(a) Eigenvalue distribution of PCA at the lower transition point, (b) The first 
two principal components PCI and PC2 at the lower transition point are shown. 
The residues belonging to the fluctuation unit of PCI are shown in red and those 
of PC2 are blue. Black indicates unstable residues with low contact formation 
probabilities. ADP (not included in the simulation) is orange, (c) Free-energy 
landscape at the lower transition point in two dimensional space spanned by the 
first two principal components PCI and PC2. (d) Energy dependence of the 
mean contact probability of the native contact for each residue. 
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Figure 4: Energy dependence of the mean contact probability of the native 
contact for each residue for (a) AU(4.5) and (b) AU(4.5)/MJ. 
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